Data science is the discipline of making data useful. Ok…so what is it?
Engineering (infrastructure and production): the process of making everything else possible
Analysis: the process of quickly turning raw information into insights
Modeling/Inference: the process of diving deeper into the data to discover patterns that are not easy to see
(This is group work from https://github.com/brohrer/academic_advisory/blob/master/authors.md!)
Data environment: data storage, Kafka platform, Hadoop and Spark clusters, etc.
Data management: parsing the logs, web scraping, API queries, and interrogating data streams.
Production: integrating models and analyses into the production system
Domain knowledge
Exploratory analysis
Storytelling
Statistical inference
Supervised learning
Unsupervised learning
Customized model development
Excerpt from How Airbnb Democratizes Data Science With Data University:
Every company claims to be data driven, but they are different…
How do you want to be driven by data?
Team matters!
Happy teams are all alike; every unhappy team is unhappy in its own way
Product matters, but don’t join a company because of the product!
If you want to be Google employee #20, you need to join Google when it had only 19 employees